A pair-to-pair amino acids substitution matrix and its applications for protein structure prediction.

Authors

  • Eran Eyal
  • Milana Frenkel-Morgenstern
  • Vladimir Sobolev
  • Shmuel Pietrokovski
Abstract

We present a new structurally derived pair-to-pair substitution matrix (P2PMAT). The matrix is constructed from a very large set of high-quality multiple sequence alignments (Blocks) integrated with protein structures. It evaluates the likelihoods of all 160,000 pair-to-pair substitutions (400 possible residue pairs replaced by any of 400 residue pairs). The P2PMAT matrix implicitly accounts for evolutionary conservation, correlated mutations, and residue-residue contact potentials. This article demonstrates the usefulness of the matrix for structural predictions. Our method for predicting protein residue-residue contacts from sequence information alone (P2PConPred) is particularly accurate in protein cores, where it performs better than other basic contact prediction methods (increasing accuracy by 25-60%). The method's mean accuracy for protein cores is 24% for 59 diverse families and 34% for a subset of proteins shorter than 100 residues. This is above the level that was recently shown to be sufficient to significantly improve ab initio protein structure prediction. We also demonstrate the ability of our approach to identify native structures within large sets (300-2000) of protein decoys. On the basis of evolutionary information alone, our method ranks the native structure in the top 0.3% of the decoys in 4/10 of the sets, and in the top 10% of the decoys in 8/10 of the sets. The method can thus be used to help filter out wrong models, complementing traditional scoring functions.
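To make the size of such a matrix concrete, the following is a minimal Python sketch of how a pair-to-pair lookup could be indexed and applied to two alignment columns. It is an illustration only, not the published P2PMAT or the P2PConPred procedure: the names `PAIR_INDEX`, `column_pair_score`, and `toy_matrix` are hypothetical, the scores are placeholder values (1.0 when a residue pair is unchanged, 0.0 otherwise), and averaging over all sequence pairs is an assumed scoring scheme.

```python
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Every ordered residue pair that can occupy two alignment columns: 20 * 20 = 400.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def n_pair_substitutions():
    """Number of entries in a full pair-to-pair matrix (400 x 400)."""
    return len(PAIRS) ** 2


def column_pair_score(col_i, col_j, matrix):
    """Average pair-to-pair score between two alignment columns.

    col_i, col_j: strings with one residue per aligned sequence, same order.
    matrix: hypothetical 400 x 400 table; matrix[r][c] scores the replacement
            of residue pair r by residue pair c (indices from PAIR_INDEX).
    Sequence pairs containing gaps or non-standard residues are skipped.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must come from the same alignment")
    total, count = 0.0, 0
    for s, t in combinations(range(len(col_i)), 2):
        pair_s = col_i[s] + col_j[s]  # residues of sequence s at the two columns
        pair_t = col_i[t] + col_j[t]  # residues of sequence t at the two columns
        if pair_s in PAIR_INDEX and pair_t in PAIR_INDEX:
            total += matrix[PAIR_INDEX[pair_s]][PAIR_INDEX[pair_t]]
            count += 1
    return total / count if count else 0.0


# Placeholder scores (NOT P2PMAT values): 1.0 on the diagonal, 0.0 elsewhere.
toy_matrix = [[1.0 if r == c else 0.0 for c in range(len(PAIRS))]
              for r in range(len(PAIRS))]

if __name__ == "__main__":
    print(n_pair_substitutions())
    print(column_pair_score("AAK", "DDE", toy_matrix))
```

With a real log-odds table in place of `toy_matrix`, higher column-pair scores would indicate column pairs whose joint substitution pattern resembles that of known contacting positions; how the published method combines and thresholds such scores is not specified in the abstract.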

Similar resources

Rapid assessment of correlated amino acids from pair-to-pair (P2P) substitution matrices

Identification of correlated amino acids in proteins has been a topic of broad interest in view of its functional implications and importance in protein design. A new set of pair-to-pair (P2P) substitution matrices for amino acids was recently introduced as a useful tool for inferring information on such correlated sites. We present a website developed for automated application of th...

Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks

Background: Prediction of protein localization is among the most important problems in bioinformatics and is used to predict the location of proteins in cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied to the prediction of intracellular protein locations. These algorithms use the features extracted from pro...

Prediction of Protein Secondary Structure Based on Residue Pairs

The GOR program for predicting protein secondary structure is extended to include triple correlation. A score system for a residue pair to be at certain conformation state is derived from the conditional weight matrix describing amino acid frequencies at each position of a window flanking the pair under the condition for the pair to be at the fixed state. A program using this score system to pr...

FTIR Biospectroscopy Investigation on Cisplatin Cytotoxicity in Three Pairs of Sensitive and Resistant Cell Line

Fourier transform infrared (FTIR) spectroscopy has been used extensively for biological applications. Cisplatin is one of the most useful antineoplastic chemotherapy drugs for a variety of human cancers. One clinical problem in its application, which consequently affects the therapeutic outcome, is the occurrence of resistance to this agent. In this proj...

A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence.

In protein fold recognition, a probe amino acid sequence is compared to a library of representative folds of known structure to identify a structural homolog. In cases where the probe and its homolog have clear sequence similarity, traditional residue substitution matrices have been used to predict the structural similarity. In cases where the probe is sequentially distant from its homolog, we ...

Journal:
  • Proteins

Volume 67, Issue 1

Publication date: 2007